CLAIMS

1. (Original) A fault tolerant computer system for executing one or more jobs on one or more
nodes, comprising,

a hierarchy of monitors for monitoring operations in the computer system including,
one or more first monitors for monitoring first operations and, for any particular
one of said first operations that fails, for restarting another instance of said
particular one of said first operations,
one or more second monitors for monitoring said first monitors and, if any particular
one of said first monitors fails, for restarting another instance of said
particular one of said first monitors.
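
By way of illustration only, the two-level arrangement recited in Claim 1 might be sketched in Python as follows. The class names `Operation`, `FirstMonitor` and `SecondMonitor`, the `alive` flag used to simulate a failure, and the `instance` counter are all assumptions introduced for this sketch; the claim does not prescribe any of them, nor how a failure is detected.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operation:
    """A first operation (for example a job); `alive` is cleared to simulate a failure."""
    name: str
    instance: int = 1
    alive: bool = True


@dataclass
class FirstMonitor:
    """Watches operations and restarts another instance of any that has failed."""
    name: str
    operations: List[Operation] = field(default_factory=list)
    instance: int = 1
    alive: bool = True

    def check(self) -> List[str]:
        restarted = []
        for i, op in enumerate(self.operations):
            if not op.alive:
                # Replace the failed operation with a fresh instance.
                self.operations[i] = Operation(op.name, op.instance + 1)
                restarted.append(op.name)
        return restarted


@dataclass
class SecondMonitor:
    """Watches first monitors and restarts another instance of any that has failed."""
    name: str
    monitors: List[FirstMonitor] = field(default_factory=list)

    def check(self) -> List[str]:
        restarted = []
        for i, mon in enumerate(self.monitors):
            if not mon.alive:
                # The new monitor instance takes over the failed monitor's operations.
                self.monitors[i] = FirstMonitor(mon.name, mon.operations, mon.instance + 1)
                restarted.append(mon.name)
        return restarted


# Hypothetical names, chosen only for the example.
job = Operation("job-a")
agent = FirstMonitor("agent-1", [job])
coordinator = SecondMonitor("coordinator-1", [agent])

job.alive = False                       # simulate a failed first operation
restarted_jobs = agent.check()          # first monitor restarts it

agent.alive = False                     # simulate a failed first monitor
restarted_agents = coordinator.check()  # second monitor restarts it
```

In this sketch each level only knows about the level directly beneath it, which is one simple way to read the hierarchy; other arrangements recited in the dependent claims (for example first monitors also watching second monitors) would need additional links.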

2. (Original) The system of Claim 1 wherein,

said one or more of said second monitors are monitored by at least one of said first monitors
and, if any particular one of said second monitors fails, said at least one of
said first monitors restarts another instance of said particular one of said second
monitors.

3. (Original) The system of Claim 2 wherein one or more of said second monitors operates to
commit suicide if more than one of said another instance of said particular one of said second
monitors is restarted.

4. (Original) The system of Claim 1 wherein,

said nodes operate to execute processes in a service unit, a communication unit and a
resource management unit.

5. (Original) The system of Claim 1 wherein each of said nodes includes a computer having
an operating system, wherein pluralities of nodes form clusters and wherein each cluster has a
corresponding instantiation of said hierarchy of monitors for monitoring operations in the
computer system.

6. (Original) The system of Claim 5 wherein each instantiation of said hierarchy of monitors
includes,

a first instantiation of said one or more first monitors for monitoring first instantiation
operations and, for any particular one of said first instantiation operations that
fails, for restarting another instance of said particular one of said first instantiation
operations,
a second instantiation of said one or more second monitors for monitoring said first monitors
of said first instantiation and, if any particular one of said first monitors of
said first instantiation fails, for restarting another instance of said particular one of
said first monitors of said first instantiation.

7. (Original) The system of Claim 5 including first and second instantiations and wherein,

said one or more of said second monitors of said second instantiation are monitored by at
least one of said first monitors of said first instantiation and, if any particular one
of said one or more of said second monitors of said second instantiation fails, for
restarting another instance of said particular one of said one or more of said second
monitors of said second instantiation.

8. (Original) The system of Claim 1 wherein,

said second monitors maintain a record of particular ones of the first monitors that are
active and corresponding active particular ones of said first operations being monitored
by said particular ones of the first monitors.

9. (Original) The system of Claim 8 wherein,

said second monitors use said record to ensure that active particular ones of said first operations
monitored by a failing one of said particular ones of the first monitors
that are active is monitored by a new instance of said failing one of said particular
ones of the first monitors that are active.
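
As a hedged illustration of the record recited in Claims 8 and 9, the sketch below models the record as a Python dictionary that maps each active first monitor to the active first operations it watches. The `Record` alias, the function name `reassign_on_failure` and the monitor and job identifiers are hypothetical and appear nowhere in the claims; the claims do not specify how the record is stored.

```python
from typing import Dict, List

# Hypothetical record: active first-monitor id -> names of the active operations it watches.
Record = Dict[str, List[str]]


def reassign_on_failure(record: Record, failed_monitor: str, new_monitor: str) -> Record:
    """Return a new record in which `new_monitor` watches everything `failed_monitor` watched."""
    if failed_monitor not in record:
        raise KeyError(f"unknown monitor: {failed_monitor}")
    # Copy every surviving entry, then hand the failed monitor's operations to the new instance.
    updated = {m: list(ops) for m, ops in record.items() if m != failed_monitor}
    updated[new_monitor] = list(record[failed_monitor])
    return updated


record = {"agent-1#1": ["job-a", "job-b"], "agent-2#1": ["job-c"]}
after = reassign_on_failure(record, "agent-1#1", "agent-1#2")
```

Returning a new dictionary rather than mutating the old one is a design choice made only for this example, so that the state before and after the failure can be compared.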




10. (Original) The system of Claim 1 wherein said hierarchy of monitors includes,

one or more additional monitors for monitoring said first monitors or said second monitors,
and, if any particular one of said first monitors or said second monitors fails,
restarting another instance of said particular one of said first monitors or said second
monitors.

11. (Original) The system of Claim 10 wherein said hierarchy of monitors includes,

one or more other monitors for monitoring said first monitors, said second monitors or
said additional monitors, and, if any particular one of said first monitors, said second
monitors or said additional monitors fails, restarting another instance of said
particular one of said first monitors, said second monitors or said additional monitors.

12. (Original) The system of Claim 1 wherein,

said first operations are jobs running on said nodes for providing services and, for any
particular one of said jobs that fails, one of said first monitors restarts another instance
of said particular one of said jobs.

13. (Original) The system of Claim 12 wherein said jobs implement e-commerce transaction
services.

14. (Original) The system of Claim 12 wherein said jobs implement transaction services for
financial instruments.

15. (Original) The system of Claim 12 wherein said first monitors are host agents for monitoring
operations of a plurality of jobs on a plurality of nodes where each job is monitored by only
one of said host agents.

16. (Original) The system of Claim 12 wherein said first monitors are one or more agents operating
on a first level, each of said agents for monitoring operations of jobs on nodes where
each job is monitored by only one of said agents.

17. (Original) The system of Claim 12 wherein,

said first monitors are one or more agents operating on a first level, each of said agents
for monitoring operations of jobs on nodes where each job is monitored by only
one of said agents, and
said one or more second monitors includes one or more local coordinators operating on a
second level where each local coordinator monitors one or more of said agents.

18. (Original) The system of Claim 12 wherein,

said first monitors are one or more agents operating on a first level, each of said agents
for monitoring operations of jobs on nodes where each job is monitored by only
one of said agents, and wherein a particular one of said agents runs on a particular
one of said nodes where a job monitored by said particular one of said agents
runs.

19. (Original) The system of Claim 12 wherein,

said first monitors are one or more agents operating on a first level, each of said agents
for monitoring operations of jobs on nodes where each job is monitored by only
one of said agents, and wherein a particular one of said agents runs on a particular
one of said nodes where a job monitored by said particular one of said agents runs
on other of said nodes than said particular one of said nodes.




20. (Original) The system of Claim 12 wherein,

said first monitors are one or more agents operating on a first level, each of said agents
for monitoring operations of jobs on nodes where each job is monitored by only
one of said agents, and wherein a particular one of said agents runs on a particular
one of said nodes where a job monitored by said particular one of said agents
runs,
said second monitors are one or more local coordinators operating on a second level,
each of said local coordinators for monitoring operations of agents, and wherein a
particular one of said local coordinators runs on a particular one of said nodes
where an agent monitored by said particular one of said local coordinators runs.

21. (Original) The system of Claim 12 wherein,

said first monitors are one or more agents operating on a first level, each of said agents
for monitoring operations of jobs on nodes where each job is monitored by only
one of said agents, and wherein a particular one of said agents runs on a particular
one of said nodes where a job monitored by said particular one of said agents
runs,
said second monitors are one or more local coordinators operating on a second level,
each of said local coordinators for monitoring operations of agents, and wherein a
particular one of said local coordinators runs on a particular one of said nodes
other than where an agent monitored by said particular one of said local coordinators
runs.



22. (Original) The system of Claim 12 wherein,

said first monitors are one or more agents operating on a first level, each of said agents
for monitoring operations of jobs on nodes where each job is monitored by only
one of said agents,
said second monitors are one or more local coordinators operating on a second level,
each of said local coordinators for monitoring operations of agents,
and wherein said hierarchy of monitors includes,
one or more third monitors for monitoring said one or more second monitors and, for any
particular one of said second monitors that fails, restarting another instance of
said particular one of said second monitors, and wherein a particular one of said
third monitors that monitors said particular one of said second monitors runs on a
different node than a node where said particular one of said second monitors runs.

23. (Original) The system of Claim 22 wherein said hierarchy of monitors includes,

one or more fourth monitors for monitoring said one or more third monitors and, for any
particular one of said third monitors that fails, restarting another instance of said
particular one of said third monitors, and wherein a particular one of said fourth
monitors that monitors said particular one of said third monitors runs on a different
node than a node where said particular one of said third monitors runs.



24. (Original) The system of Claim 12 wherein,

said first monitors are one or more agents operating on a first level, each of said agents
for monitoring operations of jobs on nodes where each job is monitored by only
one of said agents,
said second monitors are one or more local coordinators operating on a second level,
each of said local coordinators for monitoring operations of agents,
and wherein said hierarchy of monitors includes,
one or more third monitors for monitoring said one or more second monitors and, for any
particular one of said second monitors that fails, restarting another instance of
said particular one of said second monitors, and wherein a particular one of said
third monitors that monitors said particular one of said second monitors runs on a
node where said particular one of said second monitors runs.

25. (Original) The system of Claim 24 wherein said hierarchy of monitors includes,

one or more fourth monitors for monitoring said one or more third monitors and, for any
particular one of said third monitors that fails, restarting another instance of said
particular one of said third monitors, and wherein a particular one of said fourth
monitors that monitors said particular one of said third monitors runs on a node
where said particular one of said third monitors runs.

26. (Original) The system of Claim 1 wherein said hierarchy of monitors includes,

one or more third monitors for monitoring said one or more second monitors and, for any
particular one of said second monitors that fails, restarting another instance of
said particular one of said second monitors.

27. (Original) The system of Claim 26 wherein one or more of said second monitors operates
to commit suicide if more than one of said instance of said particular one of said second
monitors is restarted.

28. (Original) The system of Claim 26 wherein said one or more third monitors run on different
ones of said nodes than ones of said nodes on which said second monitors run.

29. (Original) The system of Claim 26 wherein said hierarchy of monitors includes,

one or more fourth monitors for monitoring said one or more third monitors and, for any
particular one of said third monitors that fails, restarting another instance of said
particular one of said third monitors.

30. (Original) The system of Claim 29 wherein said one or more fourth monitors run on different
ones of said nodes than ones of said nodes on which said third monitors run.

31. (Original) The system of Claim 29 wherein said one or more fourth monitors run on ones
of said nodes which are the same as ones of said nodes on which said third monitors run.

32. (Original) The system of Claim 29 wherein one or more of said third monitors operates to
commit suicide if more than one of said instance of said particular one of said third monitors is
restarted.

33. (Original) The system of Claim 1 having a resource management unit including a load-balancing
for distributing jobs among said nodes.

34. (Original) The system of Claim 1 having a resource management unit including a persistent
storage unit.

35. (Original) The system of Claim 1 having a resource management unit including an interface
unit.

36. (Original) The system of Claim 1 wherein,

each of said nodes includes a plurality of computers each having an operating system.

37. (Original) The system of Claim 1 having a plurality of clusters of said nodes, each cluster
having a corresponding instantiation of said hierarchy of monitors for monitoring operations in 
the computer system. 

38. (Original) The system of Claim 37 wherein, 

each of said clusters of nodes operates to execute processes organized into a service unit, 
a communication unit and a resource management unit. 

39. (Original) The system of Claim 37 wherein, 

said clusters of nodes are organized into groups, each group having one or more of said 
clusters. 



a first one of said groups is located at a geographic location remote from a second one of 
said groups and said first one of said groups is connected to said second one of 
said groups by one or more networks. 



a first one of said groups is organized to execute on one subset of data and a second one 
of said groups is organized to execute on another subset of data. 

42. (Original) The system of Claim 37 wherein, 

a first one of said groups is organized to execute on one subset of data and a second one
of said groups is organized to provide backup for said one subset of data.



Attorney Docket No.: ATAE101 5DEL 

ioi5_oo A o7 A 20.fi.wpd 



Page 87 of 94 



Express Mail Label No.:EL328296286US 

7/20/0-22:31 




43. (Original) The system of Claim 1 wherein, 

said first operations are jobs running on said nodes for providing services, 

said first monitor senses one or more conditions that can cause any particular one of said
jobs to fail whether or not said particular one of said jobs has actually failed,
one of said first monitors terminates said particular one of said jobs and restarts another



44. (Original) The system of Claim 43 wherein, 

said one of said first monitors that terminates said particular one of said jobs restarts said 
another instance of said particular one of said jobs in an environment where said 
one or more conditions are not present. 

45. (Original) The system of Claim 43 wherein, 

said one of said conditions is a node failure and said another instance of said particular
one of said jobs is started on a different non-failing node. 

46. (Original) The system of Claim 43 wherein, 

said one of said conditions is a job failure and said another instance of said particular one 
of said jobs is started as a new instance of said job. 

47. (Original) The system of Claim 46 wherein, 

said another instance of said particular one of said jobs is started as a new instance of 
said job on a node the same as a node on which said particular one of said jobs was running. 

48. (Original) The system of Claim 46 wherein, 

said another instance of said particular one of said jobs is started as a new instance of
said job on a new node different from a node on which said particular one of said jobs was
running.

49. (Original) The system of Claim 1 wherein each of said nodes includes a computer and
wherein new ones of said nodes are added to the system without disturbing the operations of 
other of said nodes in the computer system and wherein jobs are assigned dynamically to said 
new ones of said nodes. 

50. (Original) The system of Claim 1 wherein each of said nodes includes a computer and 
wherein ones of said nodes are removed from the system without disturbing the operations of
other of said nodes in the computer system and wherein particular jobs are reassigned dynamically
to other of said nodes in the computer system.

51. (Original) The system of Claim 1 wherein each of said nodes includes a computer of one
type and wherein new ones of said nodes are added to the system including upgraded computers
of a different type without disturbing the operations of other of said nodes in the computer
system and wherein jobs are assigned dynamically from said other of said nodes to said new ones of
said nodes to provide dynamic upgrade of said system without stopping said particular jobs.

52. (Original) The system of Claim 1 wherein pluralities of nodes form clusters and wherein 
particular ones of said clusters are assigned for processing particular jobs at particular times and
wherein other ones of said clusters are assigned for processing said particular jobs at other times. 

53. (Original) The system of Claim 52 wherein said particular times and said other times are 
follow-the-sun times. 

54. (Original) The system of Claim 1 wherein a delay time is controlled before the restart of a 
job. 

55. (Original) The system of Claim 1 wherein a delay time is controlled before the restart of a
job. An interface that allows humans to monitor the health of the system and to log statistics
about uptime of each component in the system.

56. (Original) The system of Claim 1 wherein a delay time is applied before said restarting
another instance of said particular one of said first operations.

57. (Original) The system of Claim 1 wherein in said hierarchy of monitors,

said one or more of said second monitors are monitored by at least one of said first monitors
and, if any particular one of said second monitors fails, said at least one of
said first monitors, after a first delay time, restarts another instance of said particular
one of said second monitors on a node other than a node on which said particular
one of said second monitors failed.

58. (Original) The system of Claim 57 wherein,

if more than one instance of said another instance of said particular one of said second
monitors is restarted, all but one instance of said another instance of said particular
one of said second monitors commits suicide.

59. (Original) The system of Claim 57 wherein said hierarchy of monitors includes,

one or more additional monitors for monitoring said first monitors and said second monitors,
and, if any particular one of said first monitors or said second monitors fails,
restarting, after a second delay time, another instance of said particular one of
said first monitors or said second monitors.

60. (Original) The system of Claim 59 wherein,

if more than one of instance of said another instance of said particular one of said first
monitors or said second monitors is restarted, all but one instance of said another
instance of said particular one of said first monitors or said second monitors operates
to commit suicide.
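
Claims 58 and 60 recite that, when more than one replacement instance has been restarted, all but one instance commits suicide. The claims do not say how the surviving instance is chosen. The Python sketch below assumes, purely for illustration, that each instance carries a comparable integer identifier and that the smallest identifier survives; the function name `resolve_duplicates` and the tie-breaking rule are hypothetical.

```python
from typing import Iterable, List, Tuple


def resolve_duplicates(instance_ids: Iterable[int]) -> Tuple[int, List[int]]:
    """Pick one survivor among duplicate instances; every other instance should exit.

    Assumption (not from the claims): the instance with the smallest id survives.
    """
    ids = sorted(set(instance_ids))
    if not ids:
        raise ValueError("no instances to resolve")
    # First id survives; the remaining ids are the instances that should terminate themselves.
    return ids[0], ids[1:]


survivor, to_exit = resolve_duplicates([7, 3, 9])
single = resolve_duplicates([5])
```

Any deterministic rule that every instance can evaluate independently would serve the same purpose, since each duplicate then reaches the same decision about whether it is the one to remain.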





61. (Original) The system of Claim 59 wherein said hierarchy of monitors includes,

one or more other monitors for monitoring said first monitors, said second monitors and
said additional monitors, and, if any particular one of said first monitors, said second
monitors or said additional monitors fails, restarting, after a third delay time,
another instance of said particular one of said first monitors, said second monitors
or said additional monitors.



62. (Original) The system of Claim 61 wherein,

if more than one instance of said another instance of said particular one of said first monitors,
said second monitors or said additional monitors is restarted, all but one instance
of said another instance of said particular one of said first monitors, said
second monitors or said additional monitors operates to commit suicide.

63. (Original) The system of Claim 1 wherein,

said first operations are jobs running on said nodes for providing services where a particular
first one of said jobs associated with a first customer is running on a particular
first node and a particular second one of said jobs associated with a second
customer is running on said particular first node.

64. (Original) The system of Claim 1 wherein,

said first operations are jobs running on said nodes for providing services where a particular
first one of said jobs associated with a first customer is running on a particular
first node and a particular second one of said jobs associated with a second
customer is running on a particular second node whereby said first customer job
is isolated from said second customer job.

65. (Original) The system of Claim 1 wherein,

said first operations are jobs running on said nodes for providing services where,
particular first ones of said jobs are associated with a first customer with one of
said particular first ones of said jobs running on a particular first node and
with another one of said particular first ones of said jobs running on a
particular other node;
particular second ones of said jobs are associated with a second customer with
one of said particular second ones of said jobs running on a particular second
node and with another one of said particular second ones of said jobs
running on said particular other node.

66. (Original) The system of Claim 1 including transaction initiators for starting said first
operations as one or more jobs to initiate a transaction in a service.

67. (Original) The system of Claim 1 including transaction processors for starting said first
operations as one or more jobs to process a transaction in a service.

68. (Original) The system of Claim 1 including,

transaction initiators for starting first ones or more of said first operations as one or more
first jobs on a first node to initiate a transaction in a service;
transaction processors for starting other ones or more of said first operations as one or
more other jobs on another node to process said transaction in said service.

(Original) The system of Claim 1 including,

transaction initiators for starting first ones or more of said first operations as one or more
first jobs on a first node to initiate a transaction in a service;
transaction processors for starting other ones or more of said first operations as one or
more other jobs on another node to process said transaction in said service.

(Original) The system of Claim 1 including,

transaction initiators for starting first ones or more of said first operations as one or more
first jobs on a first node to initiate a transaction in a service;
transaction processors for starting other ones or more of said first operations as one or
more other jobs on said first node to process said transaction in said service.

(Original) In a fault tolerant computer system operating to execute one or more jobs on
one or more nodes where the computer system includes a hierarchy of monitors for monitoring
operations in the computer system, the method comprising,

monitoring first operations with one or more first monitors and, for any particular one of
said first operations that fails, restarting another instance of said particular one of
said first operations,
monitoring said first monitors with one or more second monitors and, if any particular
one of said first monitors fails, restarting another instance of said particular one of
said first monitors.

(Original) The method of Claim wherein,

monitoring said one or more of said second monitors with at least one of said first monitors
and, if any particular one of said second monitors fails, restarting with said
at least one of said first monitors another instance of said particular one of said
second monitors.

(Original) The method of Claim wherein one or more of said second monitors operates
to commit suicide if more than one of said another instance of said particular one of said second
monitors is restarted.